It is best to install packages (e.g., install.packages("cowplot")) from the command line only. If you include this code in an R code chunk, you risk R trying to reinstall the package each time you knit your file.
When discussing prior knowledge related to Bayesian methods, is that the same as in genomic selection, where a genomic breeding value is assigned based on training data sets to inform a model on how to handle the experimental data sets?
\[P[\Theta | Data] = \frac{P[Data | \Theta] \times P[\Theta]}{P[Data]}\]
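As a toy illustration of this formula (made-up numbers, not the mouse data), the posterior can be computed directly for a small discrete set of candidate parameter values:

```r
# Toy discrete example: 6 successes in 10 trials, three candidate thetas
theta <- c(0.25, 0.50, 0.75)                  # candidate parameter values
prior <- rep(1 / 3, 3)                        # flat prior: P[Theta]
lik   <- dbinom(6, size = 10, prob = theta)   # P[Data | Theta]
post  <- lik * prior / sum(lik * prior)       # P[Theta | Data]; denominator is P[Data]
round(post, 3)                                # posterior sums to 1; largest at theta = 0.5
```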
M <- read_excel("../data/Mouse_Weaning_Data.xlsx") %>%
select(MouseID, Sex, WnMass) %>%
drop_na() %>%
mutate(Sex_f = factor(if_else(Sex == 0, "Female", "Male")))
str(M)
## tibble [2,429 x 4] (S3: tbl_df/tbl/data.frame)
##  $ MouseID: num [1:2429] 1 2 3 4 5 6 7 8 9 10 ...
##  $ Sex    : num [1:2429] 0 1 1 1 1 0 0 1 1 1 ...
##  $ WnMass : num [1:2429] 15.3 12.5 17.4 16.6 16.8 ...
##  $ Sex_f  : Factor w/ 2 levels "Female","Male": 1 2 2 2 2 1 1 2 2 2 ...
After 3 generations
M_3 <- M %>%
  filter(MouseID != -9) %>%
  filter(MouseID < 400) %>%
  mutate(Sex_f = factor(if_else(Sex == 0, "Female", "Male")))
str(M_3)
## tibble [191 x 4] (S3: tbl_df/tbl/data.frame)
##  $ MouseID: num [1:191] 1 2 3 4 5 6 7 8 9 10 ...
##  $ Sex    : num [1:191] 0 1 1 1 1 0 0 1 1 1 ...
##  $ WnMass : num [1:191] 15.3 12.5 17.4 16.6 16.8 ...
##  $ Sex_f  : Factor w/ 2 levels "Female","Male": 1 2 2 2 2 1 1 2 2 2 ...
M_3 %>% group_by(Sex_f) %>% tally()
## # A tibble: 2 x 2
##   Sex_f      n
## * <fct>  <int>
## 1 Female    91
## 2 Male     100
For a set of observations (\(Y_i\)) and hypothesized parameters (\(\Theta\)) the model likelihood is the product of the observations’ individual likelihoods:
\[\mathcal{L}\left(\left\{ Y_{i}\right\} _{i=1}^{n};\Theta\right) = \prod_{i=1}^{n}Pr\left[Y_{i}; \Theta\right]\]
Evaluate the likelihood function for different values of \(\Theta\) to estimate \(\mathcal{L}\) for different sets of \(\Theta\).
\[\mathcal{L}\left(\left\{ Y_{i}\right\} _{i=1}^{n};\Theta\right) = \prod_{i=1}^{n}Pr\left[Y_{i}; \Theta\right]\]
\[\log\left(\mathcal{L}\left(\left\{ Y_{i}\right\} _{i=1}^{n};\Theta\right)\right) = \sum_{i=1}^{n} \log\left(Pr\left[Y_{i};\Theta\right]\right)\]
So we just need to sum the log-likelihoods of the observations to get the model likelihood.
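A quick check (with made-up probabilities) of why the log scale is used: the two forms agree, but the raw product underflows to zero long before the sum of logs loses precision.

```r
# Log of a product equals the sum of the logs
p <- c(0.2, 0.05, 0.7, 0.01)
all.equal(log(prod(p)), sum(log(p)))  # TRUE

# With many observations the product underflows to 0 ...
log(prod(rep(0.01, 200)))             # -Inf
# ... but the sum of logs stays finite
sum(log(rep(0.01, 200)))              # 200 * log(0.01)
```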
Generate a range of possible values for \(\theta\):
theta <- seq(0.001, 0.999, length = 200)
Calculate the probability for 91 female mice from 191 total mice for each value of \(\theta\):
pr <- dbinom(91, 191, prob = theta)
Convert to log-likelihoods:
log_lik <- log(pr)
log_liks <- tibble(theta, log_lik)
log_liks
## # A tibble: 200 x 2
##      theta log_lik
##      <dbl>   <dbl>
##  1 0.001     -499.
##  2 0.00602   -337.
##  3 0.0110    -282.
##  4 0.0160    -248.
##  5 0.0211    -224.
##  6 0.0261    -205.
##  7 0.0311    -190.
##  8 0.0361    -177.
##  9 0.0411    -165.
## 10 0.0461    -155.
## # ... with 190 more rows
max(log_lik)
## [1] -2.852506
theta[which.max(log_lik)]
## [1] 0.4774322
91 / 191
## [1] 0.4764398
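The grid estimate matches the closed-form result: setting the derivative of the binomial log-likelihood (constant \(+\ k \log\theta + (n - k)\log(1-\theta)\)) to zero,

\[\frac{d}{d\theta}\log\mathcal{L} = \frac{k}{\theta} - \frac{n-k}{1-\theta} = 0 \quad\Rightarrow\quad \hat{\theta} = \frac{k}{n} = \frac{91}{191} \approx 0.476\]

The small difference between 0.4774 and 0.4764 is just the resolution of the 200-point grid.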
Assuming \(\theta = 0.5\)
sum(dbinom(0:91, 191, 0.5))
## [1] 0.2813992
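The cumulative sum over 0:91 is exactly what the binomial CDF computes, so pbinom() gives the same tail probability in one call:

```r
# Same tail probability via the binomial CDF
pbinom(91, 191, 0.5)  # matches sum(dbinom(0:91, 191, 0.5))
```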
M %>% group_by(Sex_f) %>% tally()
## # A tibble: 2 x 2
##   Sex_f      n
## * <fct>  <int>
## 1 Female  1202
## 2 Male    1227
pr <- dbinom(1202, 1202 + 1227, prob = theta)
Convert to log-likelihoods:
log_lik <- log(pr)
log_liks <- tibble(theta, log_lik)
log_liks
## # A tibble: 200 x 2
##      theta log_lik
##      <dbl>   <dbl>
##  1 0.001      -Inf
##  2 0.00602    -Inf
##  3 0.0110     -Inf
##  4 0.0160     -Inf
##  5 0.0211     -Inf
##  6 0.0261     -Inf
##  7 0.0311     -Inf
##  8 0.0361     -Inf
##  9 0.0411     -Inf
## 10 0.0461     -Inf
## # ... with 190 more rows
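The -Inf values appear because dbinom() underflows to 0 for extreme values of theta when n = 2,429, and log(0) is -Inf. Passing log = TRUE computes the log-probability directly and keeps every value finite:

```r
# Compute on the log scale from the start to avoid underflow
theta   <- seq(0.001, 0.999, length = 200)
log_lik <- dbinom(1202, 1202 + 1227, prob = theta, log = TRUE)
all(is.finite(log_lik))    # TRUE: no underflow on the log scale
theta[which.max(log_lik)]  # same maximum-likelihood theta as before
```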
max(log_lik)
## [1] -4.1509
theta[which.max(log_lik)]
## [1] 0.4924774
1202 / (1202 + 1227)
## [1] 0.4948538
M <- read_excel("../data/Mouse_Weaning_Data.xlsx") %>%
select(MouseID, Sex, WnMass) %>%
drop_na() %>%
mutate(Sex_f = factor(if_else(Sex == 0, "Female", "Male")))
str(M)
## tibble [2,429 x 4] (S3: tbl_df/tbl/data.frame)
##  $ MouseID: num [1:2429] 1 2 3 4 5 6 7 8 9 10 ...
##  $ Sex    : num [1:2429] 0 1 1 1 1 0 0 1 1 1 ...
##  $ WnMass : num [1:2429] 15.3 12.5 17.4 16.6 16.8 ...
##  $ Sex_f  : Factor w/ 2 levels "Female","Male": 1 2 2 2 2 1 1 2 2 2 ...
M %>%
ggplot(aes(WnMass, color = Sex_f)) +
geom_line(stat = "density", size = 1.5) +
facet_grid(Sex_f ~ .) +
scale_color_manual(values = c("Orange", "Purple"), name = "Sex") +
labs(x = "Wean Mass (g)", y = "Density", title = "Weaning Mass")
Analytical mean:
Males <- M %>% filter(Sex_f == "Male")
Males %>% summarize(mean_mass = mean(WnMass))
## # A tibble: 1 x 1
##   mean_mass
##       <dbl>
## 1      13.8
\[Pr\left[Y_i; \mu, \sigma\right] = \frac{1}{\sqrt{2\pi\sigma^{2}}} e^{\frac{-\left(Y_i-\mu\right)^{2}}{2\sigma^{2}}}\]
What values of \(\mu\) and \(\sigma\) simultaneously maximize the likelihood for all the observations?
For a set of observations (\(Y_i\)) and hypothesized parameters (\(\Theta\), here \(\mu\) and \(\sigma\)) the model likelihood is the product of the observations’ individual likelihoods:
\[\mathcal{L}\left(\left\{ Y_{i}\right\} _{i=1}^{n};\Theta\right) = \prod_{i=1}^{n}Pr\left[Y_{i}; \Theta\right]\]
\[\log\left(\mathcal{L}\left(\left\{ Y_{i}\right\} _{i=1}^{n};\Theta\right)\right) = \sum_{i=1}^{n} \log\left(Pr\left[Y_{i};\Theta\right]\right)\]
Search the parameter space of \(\Theta\) (here \(\mu\) and \(\sigma\)) for the values that maximize the likelihood, evaluating the model likelihood for each possible \(\mu\) and \(\sigma\).
The analytical mean equals the ML estimate of the mean.
log_lik <- function(x, Y){
  # x[1] is mu, x[2] is sigma
  log_liks <- dnorm(Y, mean = x[1], sd = x[2], log = TRUE)
  return(sum(log_liks))
}
The function takes a vector x <- c(mu, sigma) and a vector of Y values, and calculates the log-likelihood of each Y given the mean and standard deviation. For combinations of \(\mu\) and \(\sigma\), calculate the model likelihood. Pick the largest log-likelihood as the maximum likelihood estimates.
Set up the grid:
n <- 100                          # How fine is the grid
mu <- seq(5, 20, length = n)      # Vector of mu
sigma <- seq(0.1, 5, length = n)  # Vector of sigma
grid_approx <- crossing(mu, sigma) %>%
  mutate(log_lik = 0)
grid_approx
## # A tibble: 10,000 x 3
##       mu sigma log_lik
##    <dbl> <dbl>   <dbl>
##  1     5 0.1         0
##  2     5 0.149       0
##  3     5 0.199       0
##  4     5 0.248       0
##  5     5 0.298       0
##  6     5 0.347       0
##  7     5 0.397       0
##  8     5 0.446       0
##  9     5 0.496       0
## 10     5 0.545       0
## # ... with 9,990 more rows
for (ii in 1:nrow(grid_approx)) {
grid_approx[ii, "log_lik"] <-
log_lik(c(grid_approx$mu[ii], grid_approx$sigma[ii]),
Males$WnMass)
}
grid_approx now maps each combination of mu and sigma to its log_lik:
head(grid_approx)
## # A tibble: 6 x 3
##      mu sigma   log_lik
##   <dbl> <dbl>     <dbl>
## 1     5 0.1   -5144773.
## 2     5 0.149 -2301597.
## 3     5 0.199 -1298860.
## 4     5 0.248  -832927.
## 5     5 0.298  -579252.
## 6     5 0.347  -426079.
This approach is coarse, time consuming, and not feasible for fine grids or many parameters.
On this grid, the maximum likelihood estimate of \(\mu\) is:
grid_approx[which.max(grid_approx$log_lik), ]
##            mu    sigma   log_lik
## 4228 13.78788 2.475758 -2845.812
The analytical estimate is:
mean(Males$WnMass)
## [1] 13.82238
reltol says to stop when the improvement is \(<10^{-100}\).
MLE <- optim(c(13, 1), # Start at mu = 13, sigma = 1
log_lik,
Y = Males$WnMass,
control = list(fnscale = -1, # Find maximum
reltol = 10^-100))
mean(Males$WnMass)
## [1] 13.82238
print(MLE)
## $par
## [1] 13.822380  2.460218
## 
## $value
## [1] -2845.644
## 
## $counts
## function gradient 
##      111       NA 
## 
## $convergence
## [1] 0
## 
## $message
## NULL
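Note that the ML estimate of sigma (2.4602) is slightly below what sd() would report: the ML estimator divides by n, while the sample standard deviation divides by n - 1. A small check on made-up data:

```r
# ML sigma divides by n; sd() divides by n - 1 (made-up masses for illustration)
y <- c(12.1, 13.5, 14.2, 13.8, 15.0)
n <- length(y)
sigma_ml <- sqrt(sum((y - mean(y))^2) / n)  # ML estimate
sigma_ml
sd(y) * sqrt((n - 1) / n)                   # rescaled sample sd: equals sigma_ml
```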
glm() uses an optimization routine.
set.seed(36428)
reps <- 100000
resampled_mean <- numeric(length = reps)
for (i in 1:reps) {
resampled_mean[i] <- mean(sample(Males$WnMass,
replace = TRUE))
}
Each iteration:

1. Samples Males$WnMass with replacement, yielding a vector equal in length to Males$WnMass but where each value can appear more than once.
2. Stores the mean of that resampled vector in resampled_mean.

The mean of these means is the estimate of the population mean (via the central limit theorem).
mean(Males$WnMass)
## [1] 13.82238
mean(resampled_mean)
## [1] 13.82238
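The same resampling idea yields an interval estimate: the quantiles of the bootstrap means form a percentile confidence interval. A self-contained sketch on simulated data (standing in for Males$WnMass):

```r
# Percentile bootstrap CI for a mean, on simulated stand-in data
set.seed(1)
y <- rnorm(50, mean = 14, sd = 2.5)
boot_means <- replicate(10000, mean(sample(y, replace = TRUE)))
mean(boot_means)                       # close to mean(y)
quantile(boot_means, c(0.025, 0.975))  # 95% percentile bootstrap CI
```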
Ranges of possible maximum likelihood values:
Drawbacks:
Can we do better? Yes, Bayesian priors.
R, Python, MATLAB, etc. interfaces to samplers #2-4.
rstan: Build your own models
rethinking: Convenience interface to rstan
brms and rstanarm: High-level model syntax
fm_sd5 <- ulam(
alist(
WnMass ~ dnorm(mu, sigma),
mu ~ dnorm(14, 5),
sigma ~ dcauchy(0, 3)
),
data = Males_sub
)
## 
## SAMPLING FOR MODEL '87bc108e88ace8f48ed0725b0d6d7d0e' NOW (CHAIN 1).
## Chain 1: Rejecting initial value:
## Chain 1:   Error evaluating the log probability at the initial value.
## Chain 1: Exception: normal_lpdf: Scale parameter is -1.01634, but must be > 0! (in 'model44cce6d7376_87bc108e88ace8f48ed0725b0d6d7d0e' at line 11)
## 
## Chain 1: 
## Chain 1: Gradient evaluation took 0 seconds
## Chain 1: 1000 transitions using 10 leapfrog steps per transition would take 0 seconds.
## Chain 1: Adjust your expectations accordingly!
## Chain 1: 
## Chain 1: 
## Chain 1: Iteration:    1 / 1000 [  0%]  (Warmup)
## Chain 1: Iteration:  100 / 1000 [ 10%]  (Warmup)
## Chain 1: Iteration:  200 / 1000 [ 20%]  (Warmup)
## Chain 1: Iteration:  300 / 1000 [ 30%]  (Warmup)
## Chain 1: Iteration:  400 / 1000 [ 40%]  (Warmup)
## Chain 1: Iteration:  500 / 1000 [ 50%]  (Warmup)
## Chain 1: Iteration:  501 / 1000 [ 50%]  (Sampling)
## Chain 1: Iteration:  600 / 1000 [ 60%]  (Sampling)
## Chain 1: Iteration:  700 / 1000 [ 70%]  (Sampling)
## Chain 1: Iteration:  800 / 1000 [ 80%]  (Sampling)
## Chain 1: Iteration:  900 / 1000 [ 90%]  (Sampling)
## Chain 1: Iteration: 1000 / 1000 [100%]  (Sampling)
## Chain 1: 
## Chain 1:  Elapsed Time: 0.038 seconds (Warm-up)
## Chain 1:                0.026 seconds (Sampling)
## Chain 1:                0.064 seconds (Total)
## Chain 1: 
mean(Males_sub$WnMass)
## [1] 13.82238
precis(fm_sd5)
##            mean         sd      5.5%     94.5%    n_eff     Rhat4
## mu    13.818507 0.06727362 13.711583 13.927994 242.0024 1.0038579
## sigma  2.460989 0.04948409  2.382388  2.539553 320.0294 0.9989122
fm_sd10 <- map2stan(
alist(
WnMass ~ dnorm(mu, sigma),
mu ~ dnorm(14, 10),
sigma ~ dcauchy(0, 3)
),
data = Males_sub
)
## 
## SAMPLING FOR MODEL '506ccf234967eaf90497b783b32a95ff' NOW (CHAIN 1).
## Chain 1: 
## Chain 1: Gradient evaluation took 0 seconds
## Chain 1: 1000 transitions using 10 leapfrog steps per transition would take 0 seconds.
## Chain 1: Adjust your expectations accordingly!
## Chain 1: 
## Chain 1: 
## Chain 1: Iteration:    1 / 2000 [  0%]  (Warmup)
## Chain 1: Iteration:  200 / 2000 [ 10%]  (Warmup)
## Chain 1: Iteration:  400 / 2000 [ 20%]  (Warmup)
## Chain 1: Iteration:  600 / 2000 [ 30%]  (Warmup)
## Chain 1: Iteration:  800 / 2000 [ 40%]  (Warmup)
## Chain 1: Iteration: 1000 / 2000 [ 50%]  (Warmup)
## Chain 1: Iteration: 1001 / 2000 [ 50%]  (Sampling)
## Chain 1: Iteration: 1200 / 2000 [ 60%]  (Sampling)
## Chain 1: Iteration: 1400 / 2000 [ 70%]  (Sampling)
## Chain 1: Iteration: 1600 / 2000 [ 80%]  (Sampling)
## Chain 1: Iteration: 1800 / 2000 [ 90%]  (Sampling)
## Chain 1: Iteration: 2000 / 2000 [100%]  (Sampling)
## Chain 1: 
## Chain 1:  Elapsed Time: 0.05 seconds (Warm-up)
## Chain 1:                0.048 seconds (Sampling)
## Chain 1:                0.098 seconds (Total)
## Chain 1: 
## Computing WAIC
## Error in .local(fit, data, n, ...) : 
##   There appear to be no linear models here
mean(Males_sub$WnMass)
## [1] 13.82238
precis(fm_sd10)
##            mean         sd      5.5%     94.5%    n_eff     Rhat4
## mu    13.823756 0.06891462 13.718320 13.929705 601.2564 0.9999429
## sigma  2.463075 0.04867016  2.386445  2.543367 567.7990 1.0016267
fm_mu10 <- map2stan(
alist(
WnMass ~ dnorm(mu, sigma),
mu ~ dnorm(10, 5),
sigma ~ dcauchy(0, 3)
),
data = Males_sub
)
## 
## SAMPLING FOR MODEL 'c17be1a859b3f05ea6a9ed4d2290bba4' NOW (CHAIN 1).
## Chain 1: 
## Chain 1: Gradient evaluation took 0 seconds
## Chain 1: 1000 transitions using 10 leapfrog steps per transition would take 0 seconds.
## Chain 1: Adjust your expectations accordingly!
## Chain 1: 
## Chain 1: 
## Chain 1: Iteration:    1 / 2000 [  0%]  (Warmup)
## Chain 1: Iteration:  200 / 2000 [ 10%]  (Warmup)
## Chain 1: Iteration:  400 / 2000 [ 20%]  (Warmup)
## Chain 1: Iteration:  600 / 2000 [ 30%]  (Warmup)
## Chain 1: Iteration:  800 / 2000 [ 40%]  (Warmup)
## Chain 1: Iteration: 1000 / 2000 [ 50%]  (Warmup)
## Chain 1: Iteration: 1001 / 2000 [ 50%]  (Sampling)
## Chain 1: Iteration: 1200 / 2000 [ 60%]  (Sampling)
## Chain 1: Iteration: 1400 / 2000 [ 70%]  (Sampling)
## Chain 1: Iteration: 1600 / 2000 [ 80%]  (Sampling)
## Chain 1: Iteration: 1800 / 2000 [ 90%]  (Sampling)
## Chain 1: Iteration: 2000 / 2000 [100%]  (Sampling)
## Chain 1: 
## Chain 1:  Elapsed Time: 0.054 seconds (Warm-up)
## Chain 1:                0.05 seconds (Sampling)
## Chain 1:                0.104 seconds (Total)
## Chain 1: 
## Computing WAIC
## Error in .local(fit, data, n, ...) : 
##   There appear to be no linear models here
mean(Males_sub$WnMass)
## [1] 13.82238
nrow(Males)
## [1] 1227
precis(fm_mu10)
##            mean         sd      5.5%     94.5%    n_eff     Rhat4
## mu    13.819822 0.07090982 13.703792 13.933466 605.9309 1.0008821
## sigma  2.464692 0.05389681  2.382107  2.555207 705.0477 0.9994903
fm_sd0.1 <- map2stan(
alist(
WnMass ~ dnorm(mu, sigma),
mu ~ dnorm(14, 0.1),
sigma ~ dcauchy(0, 3)
),
data = Males_sub
)
## 
## SAMPLING FOR MODEL '8082e8efa66a556437ce4fe9d7803a53' NOW (CHAIN 1).
## Chain 1: 
## Chain 1: Gradient evaluation took 0 seconds
## Chain 1: 1000 transitions using 10 leapfrog steps per transition would take 0 seconds.
## Chain 1: Adjust your expectations accordingly!
## Chain 1: 
## Chain 1: 
## Chain 1: Iteration:    1 / 2000 [  0%]  (Warmup)
## Chain 1: Iteration:  200 / 2000 [ 10%]  (Warmup)
## Chain 1: Iteration:  400 / 2000 [ 20%]  (Warmup)
## Chain 1: Iteration:  600 / 2000 [ 30%]  (Warmup)
## Chain 1: Iteration:  800 / 2000 [ 40%]  (Warmup)
## Chain 1: Iteration: 1000 / 2000 [ 50%]  (Warmup)
## Chain 1: Iteration: 1001 / 2000 [ 50%]  (Sampling)
## Chain 1: Iteration: 1200 / 2000 [ 60%]  (Sampling)
## Chain 1: Iteration: 1400 / 2000 [ 70%]  (Sampling)
## Chain 1: Iteration: 1600 / 2000 [ 80%]  (Sampling)
## Chain 1: Iteration: 1800 / 2000 [ 90%]  (Sampling)
## Chain 1: Iteration: 2000 / 2000 [100%]  (Sampling)
## Chain 1: 
## Chain 1:  Elapsed Time: 0.054 seconds (Warm-up)
## Chain 1:                0.055 seconds (Sampling)
## Chain 1:                0.109 seconds (Total)
## Chain 1: 
## Computing WAIC
## Error in .local(fit, data, n, ...) : 
##   There appear to be no linear models here
mean(Males_sub$WnMass)
## [1] 13.82238
precis(fm_sd0.1)
##            mean         sd      5.5%     94.5%    n_eff     Rhat4
## mu    13.879698 0.05778947 13.787619 13.970305 703.2480 0.9996272
## sigma  2.460888 0.04843463  2.386658  2.539512 802.6935 1.0082159
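The upward pull toward 14 is what the conjugate normal-normal result predicts: with a N(mu_0, tau_0) prior on mu and sigma treated as known, the posterior mean is a precision-weighted average of the prior mean and the sample mean. A rough sketch, plugging in approximate values from the fits above (sigma ≈ 2.46, ybar = 13.822, and assuming n = 1227; this is an approximation, not the sampler's exact computation):

```r
# Precision-weighted posterior mean for a normal mean with known sigma
post_mean <- function(mu0, tau0, ybar, sigma, n) {
  w_prior <- 1 / tau0^2   # prior precision
  w_data  <- n / sigma^2  # data precision
  (w_prior * mu0 + w_data * ybar) / (w_prior + w_data)
}
post_mean(14, 5,   13.822, 2.46, 1227)  # weak prior: ~13.82, data dominate
post_mean(14, 0.1, 13.822, 2.46, 1227)  # strong prior: ~13.88, pulled toward 14
```

The second value closely matches the precis() mean for fm_sd0.1, which is why the tight N(14, 0.1) prior shifts the estimate away from the sample mean.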